NLP and IR Approaches to Monolingual and Multilingual Link Detection
نویسندگان
چکیده
This paper considers several important issues for monolingual and multilingual link detection. The experimental results show that nouns, verbs, adjectives and compound nouns are useful to represent news stories; story expansion is helpful; topic segmentation has a little effect; and a translation model is needed to capture the differences between languages.
منابع مشابه
Learning Multi-faceted Knowledge Graph Embeddings for Natural Language Processing
Knowledge graphs have challenged the existing embedding-based approaches for representing their multifacetedness. To address some of the issues, we have investigated some novel approaches that (i) capture the multilingual transitions on different language-specific versions of knowledge, and (ii) encode the commonly existing monolingual knowledge with important relational properties and hierarch...
متن کاملECNU at SemEval-2017 Task 1: Leverage Kernel-based Traditional NLP features and Neural Networks to Build a Universal Model for Multilingual and Cross-lingual Semantic Textual Similarity
To model semantic similarity for multilingual and cross-lingual sentence pairs, we first translate foreign languages into English, and then build an efficient monolingual English system with multiple NLP features. Our system is further supported by deep learning models and our best run achieves the mean Pearson correlation 73.16% in primary track.
متن کاملMultilingual lexicons for related languages
The great increase in work on the lexicon by computational and theoretical linguists throughout the s has concerned itself almost exclusively with monolingual lexicons Meanwhile applied work on multilingual lexicons mostly for machine translation MT has employed monolingual lexicons linked only at the level of semantics In this paper we argue that the traditional MT lexicon architecture while a...
متن کاملMultilingual Relevant Sentence Detection Using Reference Corpus
IR with reference corpus is one approach when dealing with relevant sentences detection, which takes the result of IR as the representation of query (sentence). Lack of information and language difference are two major issues in relevant detection among multilingual sentences. This paper refers to a parallel corpus for information expansion and translation, and introduces different representati...
متن کاملCross-Lingual Validation of Multilingual Wordnets
Incorporating Wordnet or its monolingual followers in modern NLP-based systems already represents a general trend motivated by numerous reports showing significant improvements in the overall performances of these systems. Multilingual wordnets, such as EuroWordNet or BalkaNet, represent one step further with great promises in the domain of multilingual processing. The paper describes one possi...
متن کامل